10 research outputs found

    Handwritten Documents Text Line Segmentation based on Information Energy

    Text line segmentation is the first step in the text recognition process. Only after text lines are correctly identified can the process proceed to the recognition of individual characters. This paper proposes a line segmentation algorithm that computes an information content level, called energy, for each pixel of the image and uses it to drive the seam carving procedure. The algorithm identifies text lines that follow the text more accurately, with the expected downside of added computational overhead.
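
    As an illustration of the seam carving step only, the sketch below finds the cheapest left-to-right seam through a given energy map using dynamic programming. The function name `min_energy_horizontal_seam` and the energy values are hypothetical; the paper's information energy computation is not described in the abstract and is not reproduced here.

```python
def min_energy_horizontal_seam(energy):
    """Return, for each column, the row of the cheapest left-to-right seam.

    `energy` is a list of rows; any per-pixel energy map can be plugged in.
    """
    rows, cols = len(energy), len(energy[0])
    # cost[r][c] = minimal cumulative energy of a seam ending at (r, c).
    cost = [[energy[r][0]] + [0.0] * (cols - 1) for r in range(rows)]
    back = [[0] * cols for _ in range(rows)]
    for c in range(1, cols):
        for r in range(rows):
            # A seam may move at most one row up or down per column.
            candidates = [p for p in (r - 1, r, r + 1) if 0 <= p < rows]
            best = min(candidates, key=lambda p: cost[p][c - 1])
            cost[r][c] = energy[r][c] + cost[best][c - 1]
            back[r][c] = best
    # Start from the cheapest end point and walk the back-pointers.
    r = min(range(rows), key=lambda i: cost[i][cols - 1])
    seam = [r]
    for c in range(cols - 1, 0, -1):
        r = back[r][c]
        seam.append(r)
    seam.reverse()
    return seam
```

    In a line segmentation setting, such a seam would run through the low-energy gap between two text lines; how seams are seeded and filtered is left open here.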

    A Study of Image Upsampling and Downsampling Filters

    In this paper, a set of techniques used for downsampling and upsampling 2D images is analyzed on various image datasets. The comparison takes into account a significant number of interpolation kernels, their parameters, and their algebraic form, focusing mostly on linear interpolation methods with symmetric kernels. The most suitable metrics for measuring the performance of combinations of upsampling and downsampling filters are presented, along with their strengths and weaknesses. A test benchmark is proposed, and the obtained results are analyzed with respect to the presented metrics, offering explanations of specific filter behaviors, either in general or in certain circumstances. Finally, a set of filter and parameter recommendations is offered, based on extensive testing on carefully selected image datasets. The research builds on a large set of research papers and on a solid discussion of the underlying signal processing theory.
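
    A minimal one-dimensional sketch of such a benchmark round trip is given below: downsample with a box filter, upsample with linear interpolation, and score the result with PSNR. The filter pair, the function names, and the choice of PSNR are assumptions made for illustration; the abstract does not name the filters or metrics the paper recommends.

```python
import math


def downsample_box(signal):
    """Halve the length by averaging adjacent pairs (box filter)."""
    return [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal) - 1, 2)]


def upsample_linear(signal, out_len):
    """Resample to out_len samples with linear (triangle kernel) interpolation."""
    n = len(signal)
    out = []
    for j in range(out_len):
        # Map the centre of output sample j into input coordinates.
        x = (j + 0.5) * n / out_len - 0.5
        x = min(max(x, 0.0), n - 1.0)  # clamp at the borders
        i = min(int(x), n - 2) if n > 1 else 0
        t = x - i
        right = signal[min(i + 1, n - 1)]
        out.append((1 - t) * signal[i] + t * right)
    return out


def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical signals."""
    mse = sum((a - b) ** 2 for a, b in zip(reference, test)) / len(reference)
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)
```

    A round trip is then `psnr(original, upsample_linear(downsample_box(original), len(original)))`; swapping in other kernels only requires replacing the two filter functions.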

    Adaptive Video Transmission Using Residue Octree Cubes

    This paper proposes a method of transmitting video streaming data based on a downsampling-upsampling pyramidal decomposition. By applying an octal tree decomposition to the frame cubes before transforming them into hypercubes, the algorithm increases the granularity of the transmitted data. The communication thus relies on a series of smaller hypercubes, as opposed to a single hypercube containing the entire, undivided frames from a sequence. This translates into increased adaptability to variations in the bandwidth of the transmission channel.
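
    The octal tree decomposition can be pictured with the sketch below, which recursively splits a frame cube (time, height, width) into eight octants. The bounds representation and the function names are hypothetical, and the residue computation and hypercube transform mentioned in the abstract are not modelled.

```python
def split_octants(cube):
    """Split a half-open (t0, t1, y0, y1, x0, x1) cube into eight octants."""
    t0, t1, y0, y1, x0, x1 = cube
    tm, ym, xm = (t0 + t1) // 2, (y0 + y1) // 2, (x0 + x1) // 2
    return [
        (ta, tb, ya, yb, xa, xb)
        for ta, tb in ((t0, tm), (tm, t1))
        for ya, yb in ((y0, ym), (ym, y1))
        for xa, xb in ((x0, xm), (xm, x1))
    ]


def octree_leaves(cube, depth):
    """Return the leaf cubes after `depth` levels of octree subdivision."""
    if depth == 0:
        return [cube]
    leaves = []
    for octant in split_octants(cube):
        leaves.extend(octree_leaves(octant, depth - 1))
    return leaves
```

    Each extra level multiplies the number of transmitted units by eight, which is one way to read the increased granularity the abstract describes.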

    The MOSAICS Model of Educational Approaches for Teaching the Practice of Software Project Management

    Maybe you have heard the line “managing programmers is like herding cats”, and if you think there is some truth in it, then you should, perhaps, consider what it is like to teach people to do this job. As the research literature shows, there is no consensus on the most suitable method for teaching a software project management course to information technology students. Moreover, the majority of publications focus on the theoretical aspects of the course, leaving few details about how the theory is applied or how students can experience the practical side. This paper proposes an abstract model of educational approaches, suggestively named MOSAICS, which may be used in teaching the practical side of a software project management course.

    Fast and Robust People Detection in RGB Images

    People detection in images has many uses today, ranging from face detection algorithms that social networks use to help users tag other people, to surveillance systems that can estimate the population density in an area or identify a suspect, to the automotive industry, where it forms part of the Pedestrian Crash Avoidance Mitigation (PCAM) system. This work focuses on creating a fast and reliable object detection algorithm, trained on scenes that depict people in an indoor environment, starting from an existing state-of-the-art approach. The proposed method improves upon the You Only Look Once version 4 (YOLOv4) network by adding a region of interest classification and regression branch similar to the Faster R-CNN head. The candidate bounding boxes proposed by YOLOv4 are ranked by confidence score, and the best candidates are kept and sent as input to the Faster Region-Based Convolutional Neural Network (R-CNN) head. To keep only the best detections, non-maximum suppression is applied to all proposals. This decreases the number of false-positive candidate bounding boxes: in the non-maximum suppression step, low-confidence detections of the regression and classification branch are eliminated by the detections of YOLOv4, and vice versa. The method can be used as the object detection algorithm in an image-based people tracking system, namely Tracktor, with a higher inference speed than Faster R-CNN. The proposed method achieves an overall accuracy of 95% and an inference time of 22 ms.
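
    The non-maximum suppression step can be sketched as the standard greedy procedure below, applied to a pooled list of proposals. The box format `(x1, y1, x2, y2, score)`, the function names, and the 0.5 IoU threshold are assumptions; the abstract does not give the paper's exact settings.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2, ...) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(detections, iou_threshold=0.5):
    """Greedy NMS: keep the highest-scoring box, drop boxes overlapping it."""
    kept = []
    for det in sorted(detections, key=lambda d: d[4], reverse=True):
        if all(iou(det, k) <= iou_threshold for k in kept):
            kept.append(det)
    return kept
```

    Pooling the proposals of both branches before calling `non_max_suppression` lets a confident box from one branch suppress an overlapping low-confidence box from the other.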

    Robust Lane Detection and Tracking Algorithm for Steering Assist Systems

    Modern vehicles rely on a multitude of sensors and cameras both to understand the environment around them and to assist the driver in different situations. Lane detection is a broadly useful process, as it can be used in safety systems such as the lane departure warning system (LDWS). It may also be used in steering assist systems, which are especially useful at night in the absence of light sources. Although such a system can be built simply by using global positioning system (GPS) maps, it then depends on an internet connection or GPS signal, which may be unavailable in some locations. Because of this, such systems should also rely on computer vision algorithms. In this paper, we improve upon an existing lane detection method by changing two distinct features, which in turn leads to better optimization and better rejection of false lane markers. We propose using a probabilistic Hough transform instead of a regular one, and a parallelogram region of interest (ROI) instead of a trapezoidal one. With these two changes we obtain an improvement in overall runtime of approximately 30%, as well as an increase in accuracy of up to 3%, compared to the original method.
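
    To make the ROI change concrete, the sketch below builds a binary mask whose region of interest is a parallelogram: both slanted edges are parallel, so every row keeps the same width. All parameter names and the geometry are hypothetical, since the abstract does not specify how the paper's ROI is placed, and the Hough transform itself is not shown.

```python
def parallelogram_mask(height, width, top, bottom, left_bottom, roi_width, shear):
    """Build a 0/1 mask with a parallelogram ROI between rows top..bottom.

    The left edge sits at `left_bottom` on the bottom row and is shifted right
    by `shear` pixels on the top row; the right edge runs parallel to it.
    """
    mask = [[0] * width for _ in range(height)]
    span = max(bottom - top, 1)
    for y in range(top, bottom + 1):
        # Linear shift of the left edge from the bottom row to the top row.
        left = left_bottom + round(shear * (bottom - y) / span)
        for x in range(max(left, 0), min(left + roi_width, width)):
            mask[y][x] = 1
    return mask
```

    Edge pixels outside the mask would be zeroed before line detection, so fewer candidate points reach the Hough stage.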

    An Efficient System for Eye Movement Desensitization and Reprocessing (EMDR) Therapy: A Pilot Study

    In this paper, we describe an actuator-based EMDR (eye movement desensitization and reprocessing) virtual assistant system that can be used for the treatment of participants with traumatic memories. EMDR is a psychological therapy designed to treat emotional distress caused by a past traumatic event, most frequently as part of post-traumatic stress disorder treatment. We implemented a system based on video, tactile, and audio actuators that includes an artificial intelligence chatbot, making the system capable of acting autonomously. We tested the system on a sample of 31 participants. Our results showed the efficiency of the EMDR virtual assistant system in reducing anxiety, distress, and the negative cognitions and emotions associated with the traumatic memory. No such systems are reported in the existing literature. Through the present research, we fill this gap by describing a system that can be used by patients with traumatic memories.